Knowledge Lean Word-Sense Disambiguation
Authors
Abstract
We present a corpus-based approach to word sense disambiguation that only requires information that can be automatically extracted from untagged text. We use unsupervised techniques to estimate the parameters of a model describing the conditional distribution of the sense group given the known contextual features. Both the EM algorithm and Gibbs Sampling are evaluated to determine which is most appropriate for our data. We compare their disambiguation accuracy in an experiment with thirteen different words and three feature sets. Gibbs Sampling results in a small but consistent improvement in disambiguation accuracy over the EM algorithm.
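As an illustration of the kind of unsupervised parameter estimation the abstract describes, the following is a minimal sketch of EM for a latent-class model in which the sense group is never observed. Several things here are assumptions not stated in the abstract: the model form (Naive Bayes, with features independent given the sense group), the smoothing constant `alpha`, the random soft initialization, and the function name `em_naive_bayes`. Gibbs Sampling, the other method evaluated, is not shown.

```python
import math
import random


def em_naive_bayes(data, n_senses, n_values, n_iter=50, alpha=0.1, seed=0):
    """EM for a latent-class Naive Bayes model (illustrative sketch only).

    data:     list of feature tuples; feature j is an int in range(n_values[j])
    n_senses: number of unlabeled sense groups to discover
    alpha:    smoothing pseudo-count (an assumption, not from the abstract)
    Returns (priors, cond, post), where post[i][s] approximates P(sense s | instance i).
    """
    rng = random.Random(seed)
    n = len(data)
    n_feats = len(n_values)

    # Start from random soft assignments of instances to sense groups.
    post = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        post.append([r / total for r in row])

    priors, cond = [], []
    for _ in range(n_iter):
        # M-step: re-estimate parameters from the expected (soft) counts.
        weight = [sum(post[i][s] for i in range(n)) for s in range(n_senses)]
        priors = [w / n for w in weight]
        cond = []
        for s in range(n_senses):
            tables = []
            for j in range(n_feats):
                counts = [alpha] * n_values[j]
                for i, x in enumerate(data):
                    counts[x[j]] += post[i][s]
                denom = sum(counts)
                tables.append([c / denom for c in counts])
            cond.append(tables)

        # E-step: recompute P(sense | features) in log space for stability.
        for i, x in enumerate(data):
            logp = [
                math.log(max(priors[s], 1e-300))
                + sum(math.log(cond[s][j][x[j]]) for j in range(n_feats))
                for s in range(n_senses)
            ]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            total = sum(unnorm)
            post[i] = [u / total for u in unnorm]

    return priors, cond, post


# Toy usage: two clearly separated contexts for one ambiguous word.
toy = [(0, 0, 0)] * 20 + [(1, 1, 1)] * 20
priors, cond, post = em_naive_bayes(toy, n_senses=2, n_values=[2, 2, 2])
```

The groups found this way are unlabeled clusters. Mapping them to dictionary senses, and scoring disambiguation accuracy, would need a separate step that this sketch does not cover.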
Related Resources
Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems
One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguati...
Demo: Enriching Text with RDF/OWL Encoded Senses
This demo paper describes an extension of the Enrycher text enhancement system, which annotates words in context, from a text fragment, with RDF/OWL encoded senses from WordNet and OpenCyc. The extension is based on a general purpose disambiguation algorithm which takes advantage of the structure and/or content of knowledge resources, reaching state-of-the-art performance when compared to other...
6 Unsupervised Corpus-Based Methods for WSD
This chapter focuses on unsupervised corpus-based methods of word sense discrimination that are knowledge-lean, and do not rely on external knowledge sources such as machine readable dictionaries, concept hierarchies, or sense-tagged text. They do not assign sense tags to words; rather, they discriminate among word meanings based on information found in unannotated corpora. This chapter reviews...
6 Unsupervised Corpus-Based Methods for WSD 6.1
This chapter focuses on unsupervised corpus-based methods of word sense discrimination that are knowledge-lean, and do not rely on external knowledge sources such as machine readable dictionaries, concept hierarchies, or sense-tagged text. They do not assign sense tags to words; rather, they discriminate among word meanings based on information found in unannotated corpora. This chapter reviews...
EBL-Hope: Multilingual Word Sense Disambiguation Using a Hybrid Knowledge-Based Technique
We present a hybrid knowledge-based approach to multilingual word sense disambiguation using BabelNet. Our approach is based on a hybrid technique derived from the modified version of the Lesk algorithm and the Jiang & Conrath similarity measure. We present our system's runs for the word sense disambiguation subtask of the Multilingual Word Sense Disambiguation and Entity Linking task of SemEva...